feat: exponential backoff retry for transient SDK errors #117
sogadaiki wants to merge 3 commits into RichardAtCT:main
Integrate the persona of the AI secretary Mai into the Telegram bot and enable Japanese-language conversation
- config/persona/mai.md: persona definition
- src/bot/i18n.py: lightweight dictionary-based i18n (ja/en)
- settings.py: add persona/knowledge/effort/permission_mode settings
- sdk_integration.py: persona loading and SDK options
- orchestrator.py/auth.py/core.py: i18n for UI messages
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
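Fix an issue where the SDK stream callback converted content consisting only of ThinkingBlock with str() and displayed it
- sdk_integration.py: skip ThinkingBlock; the fallback path also passes through only displayable blocks
- orchestrator.py: exclude text starting with [ThinkingBlock( from the progress display
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>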
Add retry logic to ClaudeSDKManager.execute_command() for transient network errors (CLIConnectionError, asyncio.TimeoutError). MCP-related connection errors and CLINotFoundError are excluded from retries. Defaults: 3 retries, 1s base delay, 3x backoff (1s → 3s → 9s), configurable via CLAUDE_RETRY_MAX_ATTEMPTS, CLAUDE_RETRY_BASE_DELAY, and CLAUDE_RETRY_BACKOFF_FACTOR. Set CLAUDE_RETRY_MAX_ATTEMPTS=0 to disable retries. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
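For readers skimming the thread, the backoff schedule described in this commit message can be sketched roughly as follows. This is a minimal illustration, not the code in the PR: `RetryPolicy`, `run_with_retry`, and the injected `sleep` parameter are hypothetical names, and the defaults simply mirror the values quoted above (3 retries, 1s base delay, 3x factor).

```python
import asyncio
from dataclasses import dataclass
from typing import Awaitable, Callable, List, TypeVar

T = TypeVar("T")


@dataclass(frozen=True)
class RetryPolicy:
    """Hypothetical container for the three retry settings (not from the PR)."""

    max_attempts: int = 3        # number of retries; 0 disables retrying
    base_delay: float = 1.0      # seconds to wait before the first retry
    backoff_factor: float = 3.0  # multiplier applied after each retry

    def delays(self) -> List[float]:
        # Delay before retry i is base_delay * backoff_factor ** i.
        return [
            self.base_delay * self.backoff_factor ** i
            for i in range(self.max_attempts)
        ]


async def run_with_retry(
    operation: Callable[[], Awaitable[T]],
    policy: RetryPolicy,
    is_retryable: Callable[[BaseException], bool],
    sleep: Callable[[float], Awaitable[None]] = asyncio.sleep,
) -> T:
    """Run `operation`, retrying retryable failures with exponential backoff."""
    delays = policy.delays()
    attempt = 0
    while True:
        try:
            return await operation()
        except Exception as exc:
            # Give up when retries are exhausted or the error is not transient.
            if attempt >= len(delays) or not is_retryable(exc):
                raise
            await sleep(delays[attempt])
            attempt += 1
```

With the default policy, `RetryPolicy().delays()` yields the 1s, 3s, 9s schedule from the commit message, and `RetryPolicy(max_attempts=0)` yields an empty schedule, so the first failure is re-raised immediately. Passing `sleep` as a parameter is only a convenience for testing without real waits.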
PR Review Summary
What looks good
Issues / questions
Suggested tests
Verdict
Friday, AI assistant to @RichardAtCT
RichardAtCT left a comment
Thanks for working on this! A few things need to be addressed before we can merge:
- CI is failing — please fix the test failures
- Unrelated files bundled — this PR includes i18n and persona config changes that aren't related to the retry logic. Please remove those and submit them as separate PRs if desired.
- Rebase needed — please rebase against current main to resolve any drift
Once cleaned up, happy to re-review!
Closing in favor of #127, which implements the same exponential backoff retry logic with passing CI and a more focused scope (no bundled i18n/persona changes). Thank you for the initial implementation; the retry pattern you established informed the final version. If you'd like to contribute the i18n/persona features separately, we'd welcome a focused PR for those.
Summary
Closes #60
- Add retry logic to ClaudeSDKManager.execute_command() for transient network errors (CLIConnectionError, asyncio.TimeoutError)
- MCP-related connection errors and CLINotFoundError are excluded from retries (configuration issues, not transient)
- Configurable via CLAUDE_RETRY_MAX_ATTEMPTS (default 3, 0=disabled), CLAUDE_RETRY_BASE_DELAY (default 1.0s), CLAUDE_RETRY_BACKOFF_FACTOR (default 3.0x)
Changes
- src/utils/constants.py
- src/config/settings.py
- src/claude/sdk_integration.py: _is_retryable_error() helper + retry loop in execute_command()
- tests/unit/test_claude/test_sdk_integration.py (TestRetryLogic)
Retry decision table
- asyncio.TimeoutError: retried
- CLIConnectionError: retried
- CLIConnectionError (MCP/server): not retried
- CLINotFoundError: not retried
- ProcessError
- CLIJSONDecodeError
Test plan
- TestRetryLogic
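The retry decisions stated in the summary (timeouts and plain connection errors are retried, MCP/server connection errors and CLINotFoundError are not) could be expressed along these lines. This is a sketch under several assumptions: the exception classes are local stand-ins rather than the SDK's own, modeling CLINotFoundError as a subclass of CLIConnectionError is an assumption, and detecting MCP/server failures by message substring is a guess at how such a check might work, not the PR's `_is_retryable_error()` implementation. Treating every unlisted error as non-retryable is likewise a conservative default chosen for the sketch.

```python
import asyncio


class CLIConnectionError(Exception):
    """Local stand-in for the SDK's connection error (not the real class)."""


class CLINotFoundError(CLIConnectionError):
    """Local stand-in; the subclass relationship is an assumption."""


# Assumed markers for spotting MCP/server failures in an error message.
MCP_MARKERS = ("mcp", "server")


def is_retryable_error(exc: BaseException) -> bool:
    """Return True only for errors that look transient."""
    if isinstance(exc, asyncio.TimeoutError):
        return True
    # Check the more specific class first, since it would also match
    # the CLIConnectionError branch below.
    if isinstance(exc, CLINotFoundError):
        return False
    if isinstance(exc, CLIConnectionError):
        message = str(exc).lower()
        # Connection errors that mention MCP/server are treated as
        # configuration problems, so they are not retried.
        return not any(marker in message for marker in MCP_MARKERS)
    # Anything not explicitly listed is not retried (sketch default).
    return False
```

For example, `is_retryable_error(CLIConnectionError("connection reset"))` is true, while `is_retryable_error(CLIConnectionError("MCP server failed to start"))` and `is_retryable_error(CLINotFoundError("claude not found"))` are both false.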